(2) INFORMATION FOR SEQ ID NO : 1 :

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7257 base pairs 

(B) TYPE: nucleic acid 



(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: nucleic acid 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1072..4284
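
As a quick consistency check on the feature table, the following minimal Python sketch works out the size of the annotated CDS from the LOCATION line. The variable names are illustrative and not part of the listing; the only inputs are the two coordinates given in the FEATURE block.

```python
# Coordinates copied from the FEATURE block of SEQ ID NO: 1 (1-based, inclusive).
cds_start, cds_end = 1072, 4284

# Number of nucleotides spanned by the CDS.
cds_length = cds_end - cds_start + 1

# Whole codons in the span and any leftover bases (0 means the span is in frame).
n_codons, remainder = divmod(cds_length, 3)

print(cds_length, n_codons, remainder)
```

The span works out to 3213 nucleotides, that is 1071 complete codons with no leftover bases. In the sequence as numbered in this listing, the last of those codons is the TGA that ends at position 4284.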

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 : 

TCAATATTGG CCATTAGCCA TATTATTCAT TGGTTATATA GCATAAATCA 
ATATTGGCTA 60 

TTGGCCATTG CATACGTTGT ATCTATATCA TAATATGTAC ATTTATATTG 
GCTCATGTCC 120

AATATGACCG CCATGTTGGC ATTGATTATT GACTAGTTAT TAATAGTAAT 
CAATTACGGG 180 

GTCATTAGTT CATAGCCCAT ATATGGAGTT CCGCGTTACA TAACTTACGG
TAAATGGCCC 240

GCCTGGCTGA CCGCCCAACG ACCCCCGCCC ATTGACGTCA ATAATGACGT 
ATGTTCCCAT 300 

AGTAACGCCA ATAGGGACTT TCCATTGACG TCAATGGGTG GAGTATTTAC 
GGTAAACTGC 360

CCACTTGGCA GTACATCAAG TGTATCATAT GCCAAGTCCG CCCCCTATTG 
ACGTCAATGA 420

CGGTAAATGG CCCGCCTGGC ATTATGCCCA GTACATGACC TTACGGGACT 
TTCCTACTTG 480 

GCAGTACATC TACGTATTAG TCATCGCTAT TACCATGGTG ATGCGGTTTT 
GGCAGTACAC 540

CAATGGGCGT GGATAGCGGT TTGACTCACG GGGATTTCCA AGTCTCCACC 
CCATTGACGT 600

CAATGGGAGT TTGTTTTGGC ACCAAAATCA ACGGGACTTT CCAAAATGTC 
GTAATAACCC 660

CGCCCCGTTG ACGCAAATGG GCGGTAGGCG TGTACGGTGG GAGGTCTATA 
TAAGCAGAGC 720

TCGTTTAGTG AACCGTCAGA TCACTAGAAG CTTTATTGCG GTAGTTTATC 
ACAGTTAAAT 780

TGCTAACGCA GTCAGTGCTT CTGACACAAC AGTCTCGAAC TTAAGCTGCA 
GAAGTTGGTC 840

GTGAGGCACT GGGCAGGTAA GTATCAAGGT TACAAGACAG GTTTAAGGAG 
ACCAATAGAA 900




ACTGGGCTTG TCGAGACAGA GAAGACTCTT GCGTTTCTGA TAGGCACCTA 
TTGGTCTTAC 960 

TGACATCCAC TTTGCCTTTC TCTCCACAGG TGTCCACTCC CAGTTCAATT 
ACAGCTCTTA 1020

AGGCTAGAGT ACTTAATACG ACTCACTATA GGCTAGCGAA GGAGATCCGC C ATG 
GCC 1077 

Met Ala 



CAC CAT CAC CAC CAT CAC GGA TAT CCA TAC GAC GTG CCA GAT TAC GCT 
1125 

His His His His His His Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala 
5 10 15 

CAG TCG AGT GCC ATG AGT AAA GGA GAA GAA CTT TTC ACT GGA GTT GTC 
1173 

Gln Ser Ser Ala Met Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val
20 25 30 



CCA ATT CTT GTT GAA TTA GAT GGT GAT GTT AAT GGG CAC AAA TTT TCT 
1221 

Pro Ile Leu Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser
35 40 45 50 

GTC AGT GGA GAG GGT GAA GGT GAT GCA ACA TAC GGA AAA CTT ACC CTT 
1269 

Val Ser Gly Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu 
55 60 65 

AAA TTT ATT TGC ACT ACT GGA AAA CTA CCT GTT CCT TGG CCA ACA CTT 
1317 

Lys Phe Ile Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu
70 75 80

GTC ACT ACT TTC ACT TAT GGT GTT CAA TGC TTT TCA AGA TAC CCA GAT 
1365 

Val Thr Thr Phe Thr Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp
85 90 95 

CAT ATG AAA CAG CAT GAC TTT TTC AAG AGT GCC ATG CCC GAA GGT TAT 
1413 

His Met Lys Gln His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr
100 105 110 

GTA CAG GAA AGA ACT ATA TTT TTC AAA GAT GAC GGG AAC TAC AAG ACA 
1461 

Val Gln Glu Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr
115 120 125 130

CGT GCT GAA GTC AAG TTT GAA GGT GAT ACC CTT GTT AAT AGA ATC GAG 
1509 

Arg Ala Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu
135 140 145 



TTA AAA GGT ATT GAT TTT AAA GAA GAT GGA AAC ATT CTT GGA CAC AAA
1557

Leu Lys Gly Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys
150 155 160

TTG GAA TAC AAC TAT AAC TCA CAC AAT GTA TAC ATC ATG GCA GAC AAA
1605

Leu Glu Tyr Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys
165 170 175

CAA AAG AAT GGA ATC AAA GTT AAC TTC AAA ATT AGA CAC AAC ATT GAA
1653

Gln Lys Asn Gly Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu
180 185 190

GAT GGA AGC GTT CAA CTA GCA GAC CAT TAT CAA CAA AAT ACT CCA ATT
1701

Asp Gly Ser Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile
195 200 205 210

GGC GAT GGC CCT GTC CTT TTA CCA GAC AAC CAT TAC CTG TCC ACA CAA
1749

Gly Asp Gly Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln
215 220 225

TCT GCC CTT TCG AAA GAT CCC AAC GAA AAG AGA GAC CAC ATG GTC CTT
1797

Ser Ala Leu Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu
230 235 240

CTT GAG TTT GTA ACA GCT GCT GGG ATT ACA CAT GGC ATG GAT GAA CTA
1845

Leu Glu Phe Val Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu
245 250 255



TAC AAA GGC GCC GGC GCT GGT GCT GGT GCT GGC GCC ATC AGC GCG CTG 
1893 

Tyr Lys Gly Ala Gly Ala Gly Ala Gly Ala Gly Ala Ile Ser Ala Leu

260 265 270 

ATC CTG GAC TCC AAA GAA TCC TTA GCT CCC CCT GGT AGA GAC GAA GTC
1941

Ile Leu Asp Ser Lys Glu Ser Leu Ala Pro Pro Gly Arg Asp Glu Val

275 280 285 290 

CCT GGC AGT TTG CTT GGC CAG GGG AGG GGG AGC GTA ATG GAC TTT TAT 
1989 

Pro Gly Ser Leu Leu Gly Gln Gly Arg Gly Ser Val Met Asp Phe Tyr

295 300 305 

AAA AGC CTG AGG GGA GGA GCT ACA GTC AAG GTT TCT GCA TCT TCG CCC 
2037 

Lys Ser Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser Ser Pro 
310 315 320 




TCA GTG GCT GCT GCT TCT CAG GCA GAT TCC AAG CAG CAG AGG ATT CTC 
2085 

Ser Val Ala Ala Ala Ser Gln Ala Asp Ser Lys Gln Gln Arg Ile Leu
325 330 335 

CTT GAT TTC TCG AAA GGC TCC ACA AGC AAT GTG CAG CAG CGA CAG CAG 
2133 

Leu Asp Phe Ser Lys Gly Ser Thr Ser Asn Val Gln Gln Arg Gln Gln
340 345 350 

CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG 
2181 

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln

355 360 365 370 

CAG CCA GGC TTA TCC AAA GCC GTT TCA CTG TCC ATG GGG CTG TAT ATG 
2229 

Gln Pro Gly Leu Ser Lys Ala Val Ser Leu Ser Met Gly Leu Tyr Met

375 380 385 

GGA GAG ACA GAA ACA AAA GTG ATG GGG AAT GAC TTG GGC TAC CCA CAG 
2277 

Gly Glu Thr Glu Thr Lys Val Met Gly Asn Asp Leu Gly Tyr Pro Gln
390 395 400 

CAG GGC CAA CTT GGC CTT TCC TCT GGG GAA ACA GAC TTT CGG CTT CTG 
2325 

Gln Gly Gln Leu Gly Leu Ser Ser Gly Glu Thr Asp Phe Arg Leu Leu

405 410 415 

GAA GAA AGC ATT GCA AAC CTC AAT AGG TCG ACC AGC GTT CCA GAG AAC 
2373 

Glu Glu Ser Ile Ala Asn Leu Asn Arg Ser Thr Ser Val Pro Glu Asn

420 425 430 

CCC AAG AGT TCA ACG TCT GCA ACT GGG TGT GCT ACC CCG ACA GAG AAG 
2421 

Pro Lys Ser Ser Thr Ser Ala Thr Gly Cys Ala Thr Pro Thr Glu Lys 

435 440 445 450 

GAG TTT CCC AAA ACT CAC TCG GAT GCA TCT TCA GAA CAG CAA AAT CGA 
2469 

Glu Phe Pro Lys Thr His Ser Asp Ala Ser Ser Glu Gln Gln Asn Arg
455 460 465 

AAA AGC CAG ACC GGC ACC AAC GGA GGC AGT GTG AAA TTG TAT CCC ACA 
2517 

Lys Ser Gln Thr Gly Thr Asn Gly Gly Ser Val Lys Leu Tyr Pro Thr
470 475 480 



GAC CAA AGC ACC TTT GAC CTC TTG AAG GAT TTG GAG TTT TCC GCT GGG 
2565 

Asp Gln Ser Thr Phe Asp Leu Leu Lys Asp Leu Glu Phe Ser Ala Gly
485 490 495 






TCC CCA AGT AAA GAC ACA AAC GAG AGT CCC TGG AGA TCA GAT CTG TTG 
2613 

Ser Pro Ser Lys Asp Thr Asn Glu Ser Pro Trp Arg Ser Asp Leu Leu 
500 505 510 

ATA GAT GAA AAC TTG CTT TCT CCT TTG GCG GGA GAA GAT GAT CCA TTC 
2661 

Ile Asp Glu Asn Leu Leu Ser Pro Leu Ala Gly Glu Asp Asp Pro Phe
515 520 525 530 

CTT CTC GAA GGG AAC ACG AAT GAG GAT TGT AAG CCT CTT ATT TTA CCG 
2709 

Leu Leu Glu Gly Asn Thr Asn Glu Asp Cys Lys Pro Leu Ile Leu Pro
535 540 545 

GAC ACT AAA CCT AAA ATT AAG GAT ACT GGA GAT ACA ATC TTA TCA AGT 
2757 

Asp Thr Lys Pro Lys Ile Lys Asp Thr Gly Asp Thr Ile Leu Ser Ser

550 555 560 

CCC AGC AGT GTG GCA CTA CCC CAA GTG AAA ACA GAA AAA GAT GAT TTC 
2805 

Pro Ser Ser Val Ala Leu Pro Gln Val Lys Thr Glu Lys Asp Asp Phe
565 570 575 

ATT GAA CTT TGC ACC CCC GGG GTA ATT AAG CAA GAG AAA CTG GGC CCA 
2853 

Ile Glu Leu Cys Thr Pro Gly Val Ile Lys Gln Glu Lys Leu Gly Pro
580 585 590 

GTT TAT TGT CAG GCA AGC TTT TCT GGG ACA AAT ATA ATT GGT AAT AAA 
2901 

Val Tyr Cys Gln Ala Ser Phe Ser Gly Thr Asn Ile Ile Gly Asn Lys
595 600 605 610 

ATG TCT GCC ATT TCT GTT CAT GGT GTG AGT ACC TCT GGA GGA CAG ATG 
2949 

Met Ser Ala Ile Ser Val His Gly Val Ser Thr Ser Gly Gly Gln Met
615 620 625 

TAC CAC TAT GAC ATG AAT ACA GCA TCC CTT TCT CAG CAG CAG GAT CAG 
2997 

Tyr His Tyr Asp Met Asn Thr Ala Ser Leu Ser Gln Gln Gln Asp Gln
630 635 640 

AAG CCT GTT TTT AAT GTC ATT CCA CCA ATT CCT GTT GGT TCT GAA AAC 
3045 

Lys Pro Val Phe Asn Val Ile Pro Pro Ile Pro Val Gly Ser Glu Asn

645 650 655 

TGG AAT AGG TGC CAA GGC TCC GGA GAG GAC AGC CTG ACT TCC TTG GGG 
3093 

Trp Asn Arg Cys Gln Gly Ser Gly Glu Asp Ser Leu Thr Ser Leu Gly

660 665 670 

GCT CTG AAC TTC CCA GGC CGG TCA GTG TTT TCT AAT GGG TAC TCA AGC 
3141 



Ala Leu Asn Phe Pro Gly Arg Ser Val Phe Ser Asn Gly Tyr Ser Ser 
675 680 685 690 

CCT GGA ATG AGA CCA GAT GTA AGC TCT CCT CCA TCC AGC TCG TCA GCA 
3189 

Pro Gly Met Arg Pro Asp Val Ser Ser Pro Pro Ser Ser Ser Ser Ala 
695 700 705 



GCC ACG GGA CCA CCT CCC AAG CTC TGC CTG GTG TGC TCC GAT GAA GCT 
3237 

Ala Thr Gly Pro Pro Pro Lys Leu Cys Leu Val Cys Ser Asp Glu Ala 
710 715 720 

TCA GGA TGT CAT TAC GGG GTG CTG ACA TGT GGA AGC TGC AAA GTA TTC 
3285 

Ser Gly Cys His Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys Val Phe 

725 730 735 

TTT AAA AGA GCA GTG GAA GGA CAG CAC AAT TAC CTT TGT GCT GGA AGA 
3333 

Phe Lys Arg Ala Val Glu Gly Gln His Asn Tyr Leu Cys Ala Gly Arg
740 745 750 

AAC GAT TGC ATC ATT GAT AAA ATT CGA AGG AAA AAC TGC CCA GCA TGC 
3381 

Asn Asp Cys Ile Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro Ala Cys

755 760 765 770 

CGC TAT CGG AAA TGT CTT CAG GCT GGA ATG AAC CTT GAA GCT CGA AAA 
3429 

Arg Tyr Arg Lys Cys Leu Gln Ala Gly Met Asn Leu Glu Ala Arg Lys
775 780 785 

ACA AAG AAA AAA ATC AAA GGG ATT CAG CAA GCC ACT GCA GGA GTC TCA 
3477 

Thr Lys Lys Lys Ile Lys Gly Ile Gln Gln Ala Thr Ala Gly Val Ser

790 795 800 

CAA GAC ACT TCG GAA AAT CCT AAC AAA ACA ATA GTT CCT GCA GCA TTA 
3525 

Gln Asp Thr Ser Glu Asn Pro Asn Lys Thr Ile Val Pro Ala Ala Leu
805 810 815 

CCA CAG CTC ACC CCT ACC TTG GTG TCA CTG CTG GAG GTG ATT GAA CCC 
3573 

Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile Glu Pro

820 825 830 

GAG GTG TTG TAT GCA GGA TAT GAT AGC TCT GTT CCA GAT TCA GCA TGG 
3621 

Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser Ala Trp 

835 840 845 850 

AGA ATT ATG ACC ACA CTC AAC ATG TTA GGT GGG CGT CAA GTG ATT GCA 
3669 



Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val Ile Ala
855 860 865 

GCA GTG AAA TGG GCA AAG GCG ATA CTA GGC TTG AGA AAC TTA CAC CTC 
3717 

Ala Val Lys Trp Ala Lys Ala Ile Leu Gly Leu Arg Asn Leu His Leu
870 875 880 

GAT GAC CAA ATG ACC CTG CTA CAG TAC TCA TGG ATG TTT CTC ATG GCA 
3765 

Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp Met Phe Leu Met Ala

885 890 895 

TTT GCC TTG GGT TGG AGA TCA TAC AGA CAA TCA AGC GGA AAC CTG CTC 
3813 

Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser Ser Gly Asn Leu Leu
900 905 910 

TGC TTT GCT CCT GAT CTG ATT ATT AAT GAG CAG AGA ATG TCT CTA CCC 
3861 

Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Ser Leu Pro

915 920 925 930 



GGC ATG TAT GAC CAA TGT AAA CAC ATG CTG TTT GTC TCC TCT GAA TTA
3909

Gly Met Tyr Asp Gln Cys Lys His Met Leu Phe Val Ser Ser Glu Leu
935 940 945

CAA AGA TTG CAG GTA TCC TAT GAA GAG TAT CTC TGT ATG AAA ACC TTA
3957

Gln Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu Cys Met Lys Thr Leu
950 955 960

CTG CTT CTC TCC TCA GTT CCT AAG GAA GGT CTG AAG AGC CAA GAG TTA
4005

Leu Leu Leu Ser Ser Val Pro Lys Glu Gly Leu Lys Ser Gln Glu Leu
965 970 975

TTT GAT GAG ATT CGA ATG ACT TAT ATC AAA GAG CTA GGA AAA GCC ATC
4053

Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu Leu Gly Lys Ala Ile
980 985 990

GTC AAA AGG GAA GGG AAC TCC AGT CAG AAC TGG CAA CGG TTT TAC CAA
4101

Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp Gln Arg Phe Tyr Gln
995 1000 1005 1010

CTG ACA AAG CTT CTG GAC TCC ATG CAT GAG GTG GTT GAG AAT CTC CTT
4149

Leu Thr Lys Leu Leu Asp Ser Met His Glu Val Val Glu Asn Leu Leu
1015 1020 1025

ACC TAC TGC TTC CAG ACA TTT TTG GAT AAG ACC ATG AGT ATT GAA TTC
4197

Thr Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr Met Ser Ile Glu Phe
1030 1035 1040

CCA GAG ATG TTA GCT GAA ATC ATC ACT AAT CAG ATA CCA AAA TAT TCA 
4245 

Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys Tyr Ser
1045 1050 1055 

AAT GGA AAT ATC AAA AAG CTT CTG TTT CAT CAA AAA TGA CTGCCTTACT 
4294 

Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln Lys *
1060 1065 1070 

AAGAAAGGTT GCCTTAAAGA AAGTTGAATT TATAGTCTAG AGTCGACCCG 
GGCGGCCGCT 4354

TCGAGCAGAC ATGATAAGAT ACATTGATGA GTTTGGACAA ACCACAACTA 
GAATGCAGTG 4414 

AAAAAAATGC TTTATTTGTG AAATTTGTGA TGCTATTGCT TTATTTGTAA 
CCATTATAAG 4474

CTGCAATAAA CAAGTTAACA ACAACAATTG CATTCATTTT ATGTTTCAGG 
TTCAGGGGGA 4534

GATGTGGGAG GTTTTTTAAA GCAAGTAAAA CCTCTACAAA TGTGGTAAAA 
TCGATAAGGA 4594

TCCGGGCTGG CGTAATAGCG AAGAGGCCCG CACCGATCGC CCTTCCCAAC 
AGTTGCGCAG 4654 

CCTGAATGGC GAATGGACGC GCCCTGTAGC GGCGCATTAA GCGCGGCGGG 
TGTGGTGGTT 4714

ACGCGCAGCG TGACCGCTAC ACTTGCCAGC GCCCTAGCGC CCGCTCCTTT
CGCTTTCTTC 4774 

CCTTCCTTTC TCGCCACGTT CGCCGGCTTT CCCCGTCAAG CTCTAAATCG 
GGGGCTCCCT 4834

TTAGGGTTCC GATTTAGAGC TTTACGGCAC CTCGACCGCA AAAAACTTGA 
TTTGGGTGAT 4894

GGTTCACGTA GTGGGCCATC GCCCTGATAG ACGGTTTTTC GCCCTTTGAC 
GTTGGAGTCC 4954

ACGTTCTTTA ATAGTGGACT CTTGTTCCAA ACTGGAACAA CACTCAACCC
TATCTCGGTC 5014

TATTCTTTTG ATTTATAAGG GATTTTGCCG ATTTCGGCCT ATTGGTTAAA 
AAATGAGCTG 5074

ATTTAACAAA TATTTAACGC GAATTTTAAC AAAATATTAA CGTTTACAAT 
TTCGCCTGAT 5134 



GCGGTATTTT CTCCTTACGC ATCTGTGCGG TATTTCACAC CGCATATGGT
GCACTCTCAG 5194

TACAATCTGC TCTGATGCCG CATAGTTAAG CCAGCCCCGA CACCCGCCAA 
CACCCGCTGA 5254

CGCGCCCTGA CGGGCTTGTC TGCTCCCGGC ATCCGCTTAC AGACAAGCTG 
TGACCGTCTC 5314 

CGGGAGCTGC ATGTGTCAGA GGTTTTCACC GTCATCACCG AAACGCGCGA 
GACGAAAGGG 5374 

CCTCGTGATA CGCCTATTTT TATAGGTTAA TGTCATGATA ATAATGGTTT 
CTTAGACGTC 5434 

AGGTGGCACT TTTCGGGGAA ATGTGCGCGG AACCCCTATT TGTTTATTTT 
TCTAAATACA 5494 

TTCAAATATG TATCCGCTCA TGAGACAATA ACCCTGATAA ATGCTTCAAT 
AATATTGAAA 5554 

AAGGAAGAGT ATGAGTATTC AACATTTCCG TGTCGCCCTT ATTCCCTTTT 
TTGCGGCATT 5614

TTGCCTTCCT GTTTTTGCTC ACCCAGAAAC GCTGGTGAAA GTAAAAGATG 
CTGAAGATCA 5674

GTTGGGTGCA CGAGTGGGTT ACATCGAACT GGATCTCAAC AGCGGTAAGA 
TCCTTGAGAG 5734

TTTTCGCCCC GAAGAACGTT TTCCAATGAT GAGCACTTTT AAAGTTCTGC 
TATGTGGCGC 5794

GGTATTATCC CGTATTGACG CCGGGCAAGA GCAACTCGGT CGCCGCATAC 
ACTATTCTCA 5854

GAATGACTTG GTTGAGTACT CACCAGTCAC AGAAAAGCAT CTTACGGATG 
GCATGACAGT 5914

AAGAGAATTA TGCAGTGCTG CCATAACCAT GAGTGATAAC ACTGCGGCCA
ACTTACTTCT 5974

GACAACGATC GGAGGACCGA AGGAGCTAAC CGCTTTTTTG CACAACATGG
GGGATCATGT 6034

AACTCGCCTT GATCGTTGGG AACCGGAGCT GAATGAAGCC ATACCAAACG 
ACGAGCGTGA 6094

CACCACGATG CCTGTAGCAA TGGCAACAAC GTTGCGCAAA CTATTAACTG 
GCGAACTACT 6154 

TACTCTAGCT TCCCGGCAAC AATTAATAGA CTGGATGGAG GCGGATAAAG 
TTGCAGGACC 6214 

ACTTCTGCGC TCGGCCCTTC CGGCTGGCTG GTTTATTGCT GATAAATCTG 
GAGCCGGTGA 6274

GCGTGGGTCT CGCGGTATCA TTGCAGCACT GGGGCCAGAT GGTAAGCCCT
CCCGTATCGT 6334



AGTTATCTAC ACGACGGGGA GTCAGGCAAC TATGGATGAA CGAAATAGAC 
AGATCGCTGA 6394

GATAGGTGCC TCACTGATTA AGCATTGGTA ACTGTCAGAC CAAGTTTACT 
CATATATACT 6454

TTAGATTGAT TTAAAACTTC ATTTTTAATT TAAAAGGATC TAGGTGAAGA 
TCCTTTTTGA 6514 

TAATCTCATG ACCAAAATCC CTTAACGTGA GTTTTCGTTC CACTGAGCGT 
CAGACCCCGT 6574

AGAAAAGATC AAAGGATCTT CTTGAGATCC TTTTTTTCTG CGCGTAATCT 
GCTGCTTGCA 6634 

AACAAAAAAA CCACCGCTAC CAGCGGTGGT TTGTTTGCCG GATCAAGAGC 
TACCAACTCT 6694 

TTTTCCGAAG GTAACTGGCT TCAGCAGAGC GCAGATACCA AATACTGTCC 
TTCTAGTGTA 6754

GCCGTAGTTA GGCCACCACT TCAAGAACTC TGTAGCACCG CCTACATACC 
TCGCTCTGCT 6814 

AATCCTGTTA CCAGTGGCTG CTGCCAGTGG CGATAAGTCG TGTCTTACCG
GGTTGGACTC 6874

AAGACGATAG TTACCGGATA AGGCGCAGCG GTCGGGCTGA ACGGGGGGTT 
CGTGCACACA 6934 

GCCCAGCTTG GAGCGAACGA CCTACACCGA ACTGAGATAC CTACAGCGTG 
AGCTATGAGA 6994 

AAGCGCCACG CTTCCCGAAG GGAGAAAGGC GGACAGGTAT CCGGTAAGCG 
GCAGGGTCGG 7054

AACAGGAGAG CGCACGAGGG AGCTTCCAGG GGGAAACGCC TGGTATCTTT 
ATAGTCCTGT 7114 

CGGGTTTCGC CACCTCTGAC TTGAGCGTCG ATTTTTGTGA TGCTCGTCAG 
GGGGGCGGAG 7174 

CCTATGGAAA AACGCCAGCA ACGCGGCCTT TTTACGGTTC CTGGCCTTTT 
GCTGGCCTTT 7234

TGCTCACATG GCTCGACAGA TCT 
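
The CDS annotation can be spot-checked by translating its first codons with the standard genetic code. The sketch below is illustrative only: the helper `translate` and the way the codon table is built are assumptions of this example, and the 54 nucleotides are copied from positions 1072 to 1125 of SEQ ID NO: 1 above.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter amino acids."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# First 18 codons of the CDS as printed in the listing (positions 1072..1125).
cds_start = ("ATG GCC CAC CAT CAC CAC CAT CAC GGA TAT CCA TAC "
             "GAC GTG CCA GAT TAC GCT")

print(translate(cds_start))
```

The one-letter output can be compared residue by residue with the three-letter line printed under the codons (Met Ala His His His His His His Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala).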



(2) INFORMATION FOR SEQ ID NO : 2 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1071 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 

Met Ala His His His His His His Gly Tyr Pro Tyr Asp Val Pro Asp
1 5 10 15

Tyr Ala Gln Ser Ser Ala Met Ser Lys Gly Glu Glu Leu Phe Thr Gly
20 25 30

Val Val Pro Ile Leu Val Glu Leu Asp Gly Asp Val Asn Gly His Lys
35 40 45

Phe Ser Val Ser Gly Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu
50 55 60

Thr Leu Lys Phe Ile Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro
65 70 75 80

Thr Leu Val Thr Thr Phe Thr Tyr Gly Val Gln Cys Phe Ser Arg Tyr
85 90 95

Pro Asp His Met Lys Gln His Asp Phe Phe Lys Ser Ala Met Pro Glu
100 105 110

Gly Tyr Val Gln Glu Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr
115 120 125

Lys Thr Arg Ala Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg
130 135 140

Ile Glu Leu Lys Gly Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly
145 150 155 160

His Lys Leu Glu Tyr Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala
165 170 175



Asp Lys Gln Lys Asn Gly Ile Lys Val Asn Phe Lys Ile Arg His Asn
180 185 190

Ile Glu Asp Gly Ser Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr
195 200 205

Pro Ile Gly Asp Gly Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser
210 215 220

Thr Gln Ser Ala Leu Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met
225 230 235 240

Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile Thr His Gly Met Asp
245 250 255

Glu Leu Tyr Lys Gly Ala Gly Ala Gly Ala Gly Ala Gly Ala Ile Ser
260 265 270

Ala Leu Ile Leu Asp Ser Lys Glu Ser Leu Ala Pro Pro Gly Arg Asp
275 280 285



Glu Val Pro Gly Ser Leu Leu Gly Gln Gly Arg Gly Ser Val Met Asp
290 295 300

Phe Tyr Lys Ser Leu Arg Gly Gly Ala Thr Val Lys Val Ser Ala Ser
305 310 315 320

Ser Pro Ser Val Ala Ala Ala Ser Gln Ala Asp Ser Lys Gln Gln Arg
325 330 335

Ile Leu Leu Asp Phe Ser Lys Gly Ser Thr Ser Asn Val Gln Gln Arg
340 345 350

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
355 360 365

Gln Gln Gln Pro Gly Leu Ser Lys Ala Val Ser Leu Ser Met Gly Leu
370 375 380

Tyr Met Gly Glu Thr Glu Thr Lys Val Met Gly Asn Asp Leu Gly Tyr
385 390 395 400

Pro Gln Gln Gly Gln Leu Gly Leu Ser Ser Gly Glu Thr Asp Phe Arg
405 410 415

Leu Leu Glu Glu Ser Ile Ala Asn Leu Asn Arg Ser Thr Ser Val Pro
420 425 430

Glu Asn Pro Lys Ser Ser Thr Ser Ala Thr Gly Cys Ala Thr Pro Thr
435 440 445

Glu Lys Glu Phe Pro Lys Thr His Ser Asp Ala Ser Ser Glu Gln Gln
450 455 460

Asn Arg Lys Ser Gln Thr Gly Thr Asn Gly Gly Ser Val Lys Leu Tyr
465 470 475 480

Pro Thr Asp Gln Ser Thr Phe Asp Leu Leu Lys Asp Leu Glu Phe Ser
485 490 495

Ala Gly Ser Pro Ser Lys Asp Thr Asn Glu Ser Pro Trp Arg Ser Asp
500 505 510

Leu Leu Ile Asp Glu Asn Leu Leu Ser Pro Leu Ala Gly Glu Asp Asp
515 520 525

Pro Phe Leu Leu Glu Gly Asn Thr Asn Glu Asp Cys Lys Pro Leu Ile
530 535 540

Leu Pro Asp Thr Lys Pro Lys Ile Lys Asp Thr Gly Asp Thr Ile Leu
545 550 555 560

Ser Ser Pro Ser Ser Val Ala Leu Pro Gln Val Lys Thr Glu Lys Asp
565 570 575



Asp Phe Ile Glu Leu Cys Thr Pro Gly Val Ile Lys Gln Glu Lys Leu
580 585 590

Gly Pro Val Tyr Cys Gln Ala Ser Phe Ser Gly Thr Asn Ile Ile Gly
595 600 605

Asn Lys Met Ser Ala Ile Ser Val His Gly Val Ser Thr Ser Gly Gly
610 615 620

Gln Met Tyr His Tyr Asp Met Asn Thr Ala Ser Leu Ser Gln Gln Gln
625 630 635 640

Asp Gln Lys Pro Val Phe Asn Val Ile Pro Pro Ile Pro Val Gly Ser
645 650 655

Glu Asn Trp Asn Arg Cys Gln Gly Ser Gly Glu Asp Ser Leu Thr Ser
660 665 670

Leu Gly Ala Leu Asn Phe Pro Gly Arg Ser Val Phe Ser Asn Gly Tyr 
675 680 685 

Ser Ser Pro Gly Met Arg Pro Asp Val Ser Ser Pro Pro Ser Ser Ser 
690 695 700 

Ser Ala Ala Thr Gly Pro Pro Pro Lys Leu Cys Leu Val Cys Ser Asp 
705 710 715 720 

Glu Ala Ser Gly Cys His Tyr Gly Val Leu Thr Cys Gly Ser Cys Lys 
725 730 735 

Val Phe Phe Lys Arg Ala Val Glu Gly Gln His Asn Tyr Leu Cys Ala
740 745 750

Gly Arg Asn Asp Cys Ile Ile Asp Lys Ile Arg Arg Lys Asn Cys Pro
755 760 765

Ala Cys Arg Tyr Arg Lys Cys Leu Gln Ala Gly Met Asn Leu Glu Ala
770 775 780

Arg Lys Thr Lys Lys Lys Ile Lys Gly Ile Gln Gln Ala Thr Ala Gly
785 790 795 800

Val Ser Gln Asp Thr Ser Glu Asn Pro Asn Lys Thr Ile Val Pro Ala
805 810 815

Ala Leu Pro Gln Leu Thr Pro Thr Leu Val Ser Leu Leu Glu Val Ile
820 825 830

Glu Pro Glu Val Leu Tyr Ala Gly Tyr Asp Ser Ser Val Pro Asp Ser
835 840 845

Ala Trp Arg Ile Met Thr Thr Leu Asn Met Leu Gly Gly Arg Gln Val
850 855 860

Ile Ala Ala Val Lys Trp Ala Lys Ala Ile Leu Gly Leu Arg Asn Leu
865 870 875 880



His Leu Asp Asp Gln Met Thr Leu Leu Gln Tyr Ser Trp Met Phe Leu
885 890 895

Met Ala Phe Ala Leu Gly Trp Arg Ser Tyr Arg Gln Ser Ser Gly Asn
900 905 910

Leu Leu Cys Phe Ala Pro Asp Leu Ile Ile Asn Glu Gln Arg Met Ser
915 920 925

Leu Pro Gly Met Tyr Asp Gln Cys Lys His Met Leu Phe Val Ser Ser
930 935 940

Glu Leu Gln Arg Leu Gln Val Ser Tyr Glu Glu Tyr Leu Cys Met Lys
945 950 955 960

Thr Leu Leu Leu Leu Ser Ser Val Pro Lys Glu Gly Leu Lys Ser Gln
965 970 975

Glu Leu Phe Asp Glu Ile Arg Met Thr Tyr Ile Lys Glu Leu Gly Lys
980 985 990

Ala Ile Val Lys Arg Glu Gly Asn Ser Ser Gln Asn Trp Gln Arg Phe
995 1000 1005

Tyr Gln Leu Thr Lys Leu Leu Asp Ser Met His Glu Val Val Glu Asn
1010 1015 1020

Leu Leu Thr Tyr Cys Phe Gln Thr Phe Leu Asp Lys Thr Met Ser Ile
1025 1030 1035 1040

Glu Phe Pro Glu Met Leu Ala Glu Ile Ile Thr Asn Gln Ile Pro Lys
1045 1050 1055

Tyr Ser Asn Gly Asn Ile Lys Lys Leu Leu Phe His Gln Lys
1060 1065 1070

(2) INFORMATION FOR SEQ ID NO : 3 :

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 46 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3 :

GCGCGCTGAT CAGAATTCCT TTTAGGAATT CTGATCAGCG CGCTGA 46

(2) INFORMATION FOR SEQ ID NO : 4 :

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: oligonucleotide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4 : 
AGAACANNNT GTTCT 

(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: oligonucleotide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 
AGGTCANNNT GACCT 
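
SEQ ID NO: 4 and SEQ ID NO: 5 each read as two 6-nucleotide half-sites separated by three unspecified bases (NNN). A minimal sketch, assuming that N is treated as its own complement, checks whether each 15-mer equals its own reverse complement, i.e. whether the two half-sites form an inverted repeat. The function name is illustrative and not part of the listing.

```python
# Complement table; N (any base) is mapped to N as an assumption of this sketch.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, ignoring block spacing."""
    return seq.replace(" ", "").upper().translate(COMPLEMENT)[::-1]

# Sequences copied from the listing (SEQ ID NO: 4 and SEQ ID NO: 5).
seq4 = "AGAACANNNT GTTCT".replace(" ", "")
seq5 = "AGGTCANNNT GACCT".replace(" ", "")

for name, seq in (("SEQ ID NO: 4", seq4), ("SEQ ID NO: 5", seq5)):
    print(name, len(seq), reverse_complement(seq) == seq)
```

Both sequences are 15 nucleotides long and come out equal to their own reverse complements, so in each case the second half-site (TGTTCT or TGACCT) is the reverse complement of the first (AGAACA or AGGTCA).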



(2) INFORMATION FOR SEQ ID NO : 6 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : oligonucleotide 



(xi) SEQUENCE DESCRIPTION : SEQ ID NO : 6 : 

TCGAGCGCGC AAGAACACAG TGTTCTGACG ACACGAAGAA CAGGATGTTC 
TCGTACAGTG 60 



(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: oligonucleotide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 : 



TCGACACTGT ACGAGAACAT CCTGTTCTTC GTGTCGTCAG AACACTGTGT 
TCTTGCGCGC 60 
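
SEQ ID NO: 6 and SEQ ID NO: 7 are listed as separate single-stranded 60-mers. The sketch below, which is only an illustration and not a statement about how the oligonucleotides were used, tests whether the two strands are complementary once the first four nucleotides of each are set aside. The 4-nucleotide offset is an assumption chosen for this check.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string containing only A/C/G/T."""
    return seq.translate(COMPLEMENT)[::-1]

# Sequences copied from SEQ ID NO: 6 and SEQ ID NO: 7, block spacing removed.
seq6 = ("TCGAGCGCGC AAGAACACAG TGTTCTGACG ACACGAAGAA CAGGATGTTC "
        "TCGTACAGTG").replace(" ", "")
seq7 = ("TCGACACTGT ACGAGAACAT CCTGTTCTTC GTGTCGTCAG AACACTGTGT "
        "TCTTGCGCGC").replace(" ", "")

overhang = 4  # assumed length of the unpaired 5' end on each strand
duplex_ok = reverse_complement(seq6[overhang:]) == seq7[overhang:]

print(len(seq6), len(seq7), seq6[:overhang], seq7[:overhang], duplex_ok)
```

Under that assumption the remaining 56 nucleotides of the two strands are exact reverse complements, and each strand carries the same four 5' nucleotides (TCGA) outside the paired region.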



(2) INFORMATION FOR SEQ ID NO : 8 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: oligonucleotide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 8 : 

TCGAGCGCGC AAGGTCACAG TGACCTGACG ACACGAAGGT CAGGATGACC 
TCGTACAGTG 60



(2) INFORMATION FOR SEQ ID NO : 9 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 9 : 



TCGACACTGT ACGAGGTCAT CCTGACCTTC GTGTCGTCAG GTCACTGTGA 
CCTTGCGCGC 60 
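
The 15-nucleotide patterns of SEQ ID NO: 4 and SEQ ID NO: 5 can be located inside the longer oligonucleotides with a simple pattern search. The sketch below is an illustration under the assumption that N stands for any of A, C, G or T; the helper `motif_hits` is hypothetical and not part of the listing.

```python
import re

def motif_hits(motif: str, seq: str) -> list:
    """Return 0-based start positions of non-overlapping motif matches in seq.

    Each N in the motif is allowed to match any single base.
    """
    pattern = re.compile(motif.replace("N", "[ACGT]"))
    return [m.start() for m in pattern.finditer(seq)]

# Patterns copied from SEQ ID NO: 4 and SEQ ID NO: 5.
seq4 = "AGAACANNNTGTTCT"
seq5 = "AGGTCANNNTGACCT"

# Oligonucleotides copied from SEQ ID NO: 6 and SEQ ID NO: 8, spacing removed.
seq6 = ("TCGAGCGCGC AAGAACACAG TGTTCTGACG ACACGAAGAA CAGGATGTTC "
        "TCGTACAGTG").replace(" ", "")
seq8 = ("TCGAGCGCGC AAGGTCACAG TGACCTGACG ACACGAAGGT CAGGATGACC "
        "TCGTACAGTG").replace(" ", "")

print("SEQ ID NO: 4 in SEQ ID NO: 6:", motif_hits(seq4, seq6))
print("SEQ ID NO: 5 in SEQ ID NO: 8:", motif_hits(seq5, seq8))
print("SEQ ID NO: 4 in SEQ ID NO: 8:", motif_hits(seq4, seq8))
```

With the sequences as printed, each pattern occurs twice in its corresponding 60-mer and the SEQ ID NO: 4 pattern does not occur in SEQ ID NO: 8.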